Separable 2D Convolution with Polymorphic Register Files

Authors

  • Catalin Bogdan Ciobanu
  • Georgi Gaydadjiev
Abstract

This paper studies the performance of separable 2D convolution on multi-lane Polymorphic Register Files (PRFs). We present a matrix transposition algorithm optimized for PRFs, and a vectorized 2D convolution algorithm that avoids strided memory accesses. We compare the throughput of our PRF to that of the NVIDIA Tesla C2050 GPU. The results show that even in bandwidth-constrained systems, multi-lane PRFs can outperform the GPU for mask sizes of 9 × 9 or larger.
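
As a rough illustration of the idea named in the abstract, the sketch below applies a separable 2D convolution as two row-wise 1D passes with a transposition in between, so that neither pass walks down a column with a strided access pattern. It is a plain Python sketch and not the authors' PRF algorithm: the function names, the zero padding at the borders, and the assumption of an odd-length 1D kernel are all choices made here for illustration.

```python
def convolve_rows(image, kernel):
    """Convolve every row with a 1D kernel (same-size output, zero padding).

    Assumes an odd-length kernel so the output is centred on each pixel.
    """
    radius = len(kernel) // 2
    width = len(image[0])
    result = []
    for row in image:
        padded = [0] * radius + list(row) + [0] * radius
        result.append([
            # True convolution: kernel[m] multiplies the pixel at j + radius - m.
            sum(kernel[m] * padded[j + 2 * radius - m] for m in range(len(kernel)))
            for j in range(width)
        ])
    return result


def transpose(matrix):
    """Return the transpose of a list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]


def separable_convolve2d(image, kernel):
    """Separable 2D convolution: row pass, transpose, row pass, transpose back.

    The column pass is expressed as a second row pass over the transposed
    intermediate result, so both passes read consecutive elements.
    """
    row_pass = convolve_rows(image, kernel)
    column_pass = convolve_rows(transpose(row_pass), kernel)
    return transpose(column_pass)


if __name__ == "__main__":
    image = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    for row in separable_convolve2d(image, [1, 2, 1]):
        print(row)
```

The result equals a direct 2D convolution with the outer product of the 1D kernel with itself, but costs two 1D passes per pixel instead of one full 2D mask evaluation.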


Related resources

A Register File with Transposed Access Mode

2D convolution and 2D transforms, such as wavelet transform and discrete cosine transform (DCT), are widely used in image and video processing. To reduce the computational complexity, these algorithms are often implemented in two separable passes of 1D processing (e.g., row-wise processing followed by column-wise processing). For example, the number of multiplications of a direct N × N 2D DCT is ...


Learning Graph Convolution Filters from Data Manifold

Convolution Neural Network (CNN) has gained tremendous success in computer vision tasks with its outstanding ability to capture the local latent features. Recently, there has been an increasing interest in extending CNNs to the general spatial domain. Although various types of graph and geometric convolution methods have been proposed, their connections to traditional 2D-convolution are not wel...


Two-dimensional cubic convolution

The paper develops two-dimensional (2D), nonseparable, piecewise cubic convolution (PCC) for image interpolation. Traditionally, PCC has been implemented based on a one-dimensional (1D) derivation with a separable generalization to two dimensions. However, typical scenes and imaging systems are not separable, so the traditional approach is suboptimal. We develop a closed-form derivation for a t...


An optimized GPU-based 2D convolution implementation

With the increasing sophistication of image processing algorithms, and due to its low computational complexity, convolution should fully benefit from the ever-increasing capacities of state-of-the-art GPUs, such as Nvidia’s Kepler and Maxwell family cards. Currently, it tends to be used as a preprocessing stage within more intricate image manipulations and has recently been implemented quite effi...


Improved Cubic Convolution for Two Dimensional Image Reconstruction

This paper describes improved piecewise cubic convolution for two-dimensional image reconstruction. Piecewise cubic convolution is one of the most popular methods for image reconstruction, but the traditional approach uses a separable two-dimensional convolution kernel that is based on a one-dimensional derivation. The traditional approach is suboptimal for the usual case of non-separable scenes...




Publication date: 2013